Abstract

In this study we analyze one year of anonymized telecommunications data for over one million customers 
from a large European cellphone operator, and we investigate the relationship between people's calls and 
their physical location. We discover that more than 90% of users who have called each other have also 
shared the same space (cell tower), even if they live far apart. Moreover, we find that close to 70% of 
users who call each other frequently (at least once per month on average) have shared the same space 
at the same time - an instance that we call co-location. Co-locations appear indicative of coordination 
calls, which occur just before face-to-face meetings. Their number is highly predictable based on the 
amount of calls between two users and the distance between their home locations - suggesting a new way 
to quantify the interplay between telecommunications and face-to-face interactions.



Introduction 



The interplay between telecommunications, travel and face-to-face meetings is an unresolved puzzle. In 
some cases it has been suggested that telecommunications may be a substitute for physical interaction [1]
- an idea that gained traction during the nineties with the rapid expansion of the Internet [2,3]. In
other cases conflicting hypotheses have been proposed, including those of a complementary [4,5], neutral [6]
or reinforcing [7] effect. Recently, social networks have been identified as possible predictors of travel
behavior, as well as of the decision to telecommute [8,9]. Social interaction has thus been integrated
into activity-travel models [10], in addition to the existing categories of travel such as commuting, leisure
and business. Furthermore, researchers such as Urry and others have argued that flows and
meetings of people produce small worlds, which require connections and meeting places - a phenomenon 
which is also known as the new mobilities paradigm. 

This study aims to provide a new perspective on the relationship between telecommunicating people
and their physical locations through an assessment of anonymized Call Detail Records (CDRs). CDRs show
great promise for academic research: they have recently been used to explore human communications [14,15],
the geography of social networks [16,17], urban dynamics [18], and human mobility patterns [19-22].
In this paper we use them for the first time to study the relationship between the telecommunications
patterns of any two people and their physical locations.
Results

We use a large anonymized dataset of over one million mobile phone users, which was gathered in 
Portugal over a twelve month period between 2006 and 2007. To safeguard personal privacy, individual 
phone numbers were anonymized by the operator before leaving storage facilities, and they were identified 
with a security ID (hash code). Each entry in the dataset is a CDR, which consists of the following
information: timestamp, caller's ID, callee's ID, call duration, caller's cell tower ID, and callee's cell tower
ID. This metadata on each call allows us to study both the mobile social interaction and the physical
location of the users within the dataset. Notably, the dataset does not contain information regarding
text messages (SMS) or data usage (internet). More details about the dataset can be found in Text S1.

In this study, we start with the initial dataset and look at all communications between pairs of users,
together with their locations at call time. As we are interested in comparing people's locations, we discard
users for whom we do not have enough samples. We use two subsets: D1, which contains all reciprocal
communications between the top 100,000 callers; and D2, which contains 10,000 pairs from D1, sampled
at different home distances to preserve the home-distance distribution found in D1 (see Text S2). In
what follows, we use D2 in cases where computational complexity limits the use of a larger set.
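
One way to draw a subset that keeps the home-distance distribution of the full set is proportional stratified sampling over distance bins. The sketch below is only an illustration under stated assumptions: the procedure actually used is described in Text S2, and the function name, the 10 km bin width and the input layout are hypothetical.

```python
import random
from collections import defaultdict


def stratified_sample(pairs, n_samples, bin_width_km=10.0, seed=0):
    """Sample pair IDs so that each home-distance bin keeps its share.

    `pairs` is a list of (pair_id, home_distance_km) tuples (assumed layout).
    """
    rng = random.Random(seed)

    # Group pair IDs by home-distance bin.
    bins = defaultdict(list)
    for pair_id, dist_km in pairs:
        bins[int(dist_km // bin_width_km)].append(pair_id)

    total = len(pairs)
    sample = []
    for key in sorted(bins):
        members = bins[key]
        # Proportional allocation, rounded to the nearest integer.
        quota = round(n_samples * len(members) / total)
        sample.extend(rng.sample(members, min(quota, len(members))))
    return sample
```

Because each quota is rounded independently, the returned sample can differ from `n_samples` by a few pairs when many bins are involved.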

We discover that at least 93% of users in D1 who reciprocally call each other have shared the same
cell tower area at least once in one year. The percentage decreases slightly as the distance between their
homes increases, but the value is still above 90% for users living 100 km apart (see Figure 1). It appears
that almost all remote communications are associated with physically sharing space. It should also
be noted that we are underestimating the percentage, as our data is only based on locations at call time,
so users might have also shared space without this being recorded in our data. Results are consistent
with what was recently found by analyzing spatio-temporal coincidences in a database of geo-tagged
pictures to infer social ties [23].
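
The shared-space check described here can be sketched as a set intersection over the towers each user was seen at during calls. This is a minimal illustration, not the authors' code; the function names and the `towers_by_user` mapping are assumptions.

```python
def have_shared_tower(towers_a, towers_b):
    """True if the two users were ever observed at a common cell tower."""
    return not set(towers_a).isdisjoint(towers_b)


def shared_fraction(pairs, towers_by_user):
    """Fraction of user pairs with at least one cell tower in common.

    `pairs` is an iterable of (user_a, user_b); `towers_by_user` maps a
    user ID to the set of tower IDs seen in that user's call records.
    """
    pairs = list(pairs)
    if not pairs:
        return 0.0
    shared = sum(
        have_shared_tower(towers_by_user.get(a, ()), towers_by_user.get(b, ()))
        for a, b in pairs
    )
    return shared / len(pairs)
```

Since towers are only observed at call time, a figure computed this way is a lower bound on actual shared space.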



If we also consider the temporal component, we can look at how often and where users are sharing 
the same space at the same time. We restrict our attention to the case when two users call each other 
using the same cell tower. This scenario is based on the hypothesis that they are calling each other to
coordinate a meeting in a nearby area, also called a "coordination knot" [24]. Of course, two people living
or working close by could also call each other very often without physically meeting. We therefore exclude
users living or working in the same cell tower area, estimated as described in Text S2. We define a
co-location event between two users (who live and work in distinct locations) as a call between the users 
while they are connected to the same cellphone tower. Each co-location is characterized by a specific time 
and place. Based on this definition, we characterize the spatio-temporal features of co-location events, to 
see whether they represent a reasonable subset of actual face-to-face meetings between users. 
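
The co-location definition above can be sketched as a filter over call records. The record layout follows the CDR fields listed earlier, but the class name, field names, and the `home` and `work` mappings (user ID to estimated tower ID) are hypothetical; how home and work towers are estimated is covered in Text S2 and is not reproduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CallRecord:
    timestamp: int      # seconds since some epoch (assumed unit)
    caller: str         # anonymized caller ID
    callee: str         # anonymized callee ID
    duration: int       # call duration in seconds
    caller_tower: int   # tower serving the caller
    callee_tower: int   # tower serving the callee


def colocation_events(records, home, work):
    """Return (timestamp, tower, caller, callee) for each co-location call.

    A call counts only if both parties use the same tower and the two
    users have distinct home towers and distinct work towers. Users
    missing from `home` or `work` compare equal (None == None) and are
    therefore excluded, which is a conservative choice.
    """
    events = []
    for r in records:
        if r.caller_tower != r.callee_tower:
            continue
        same_home = home.get(r.caller) == home.get(r.callee)
        same_work = work.get(r.caller) == work.get(r.callee)
        if same_home or same_work:
            continue
        events.append((r.timestamp, r.caller_tower, r.caller, r.callee))
    return events
```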

Starting with the larger subset (D1), we analyze the relationship between calling activity and users'
locations. Among the pairs of communicating users, 400,000 cases have two users calling each other 
while in the same cell tower area, 350,000 of which have distinct home and work locations. Interestingly, 
38.33% of the communicating users co-locate at least once during the period examined. When stronger 
relationships are considered (users who call on average at least once per month) the percentage increases 
to 69.41%. 

Call duration appears to increase with the home distance between users (see black line in Figure 2).
Calls that occur between co-located people (red line) have a shorter average call duration, suggesting 
that people who co-locate call each other briefly to coordinate the exact meeting place and time. 

We also find that the number of calls between two users increases just before and after their
co-location (Figure 3). The probability is rather constant in the interval, with two peaks around 0 and 1
(consecutive co-location events). The presence of these peaks suggests that the considered events
(co-locations) represent a reasonable proxy for face-to-face meetings. In particular, a peak of calls just before
the co-location event suggests that the two people are talking on the phone to arrange a meeting, in line
with what is hypothesized in [16,24]. The peak right after the co-location event might be explained by
a follow-up call after the meeting.
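
The normalization behind this analysis, where each call is placed on a 0-to-1 scale between the two co-locations that enclose it, can be sketched as follows. The function name and inputs are hypothetical, and calls falling outside the first and last co-location are simply dropped, which is an assumption rather than something the text specifies.

```python
import bisect


def normalized_call_positions(colocation_times, call_times):
    """Map each call to a position in [0, 1) between enclosing co-locations.

    0 corresponds to the earlier co-location and values close to 1 to the
    later one. Calls before the first or at/after the last co-location
    have no enclosing interval and are skipped.
    """
    times = sorted(colocation_times)
    positions = []
    for t in call_times:
        i = bisect.bisect_right(times, t)
        if i == 0 or i == len(times):
            continue
        start, end = times[i - 1], times[i]
        # bisect_right guarantees start <= t < end, so end - start > 0.
        positions.append((t - start) / (end - start))
    return positions
```

A histogram of these positions over all pairs would then show whether calls cluster near 0 and 1.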
We analyze the features of co-location places and compare them with geographical and communication
differences between users. We define $d_1(l)$ and $d_2(l)$ as the distances traveled by the two users at every
co-location event $l = 1, \ldots, m$, and compute three measures of comparison:

1. The median ratio between the shortest and longest distance at co-location time:

$$r_d = \operatorname{median}_l \, \frac{\min\{d_1(l), d_2(l)\}}{\max\{d_1(l), d_2(l)\}}.$$

2. The fraction of times user 1 travels less than its peer:

$$r_{t1} = \frac{1}{m} \sum_{l=1}^{m} \delta\big(d_2(l) - d_1(l)\big),$$

where:

$$\delta(x) = \begin{cases} 1 & \text{if } x > 0 \\ 0 & \text{if } x \le 0 \end{cases}$$

3. The fraction of times one of the users travels less than the peer:

$$r_t = \min\{r_{t1}, r_{t2}\},$$

where $r_{t2}$ is the analogous fraction for user 2.
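
The three measures listed above can be computed directly from the per-event distances. The sketch below is an illustration rather than the authors' implementation: it assumes the garbled symbols read as r_d, r_t1 and r_t, that r_t2 is the mirror image of r_t1, and that events where both distances are zero are left out of the ratio.

```python
from statistics import median


def travel_asymmetry(d1, d2):
    """Return (r_d, r_t1, r_t2, r_t) for one pair of users.

    d1[l] and d2[l] are the distances traveled by user 1 and user 2 to
    co-location event l.
    """
    if len(d1) != len(d2) or not d1:
        raise ValueError("d1 and d2 must be non-empty and of equal length")
    m = len(d1)

    # Ratio of shorter to longer trip per event (undefined when both are 0).
    ratios = [min(a, b) / max(a, b) for a, b in zip(d1, d2) if max(a, b) > 0]
    if not ratios:
        raise ValueError("all distances are zero; r_d is undefined")
    r_d = median(ratios)

    # Fraction of events in which each user travels strictly less.
    r_t1 = sum(1 for a, b in zip(d1, d2) if b - a > 0) / m
    r_t2 = sum(1 for a, b in zip(d1, d2) if a - b > 0) / m
    return r_d, r_t1, r_t2, min(r_t1, r_t2)
```

For example, with `d1 = [1, 2, 10]` and `d2 = [4, 8, 5]`, user 1 travels less in two of three events, so `r_t1 = 2/3` and `r_t = 1/3`.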

The first measure allows a comparison to be made between the lengths of the two users' trips.
On the D2 subset, we find on average $r_d = 0.3$, i.e. one user travels about 3 times less than the other
one. Given this asymmetry in the length of trips, we ask whether the shorter trips are
always taken by the same user, or if the two users share the short trips. The third measure $r_t$ allows an
evaluation of the asymmetry at the pair level, showing an average of 0.06. This suggests that in 94% of
the selected pairs, there is one user who consistently travels less than the peer. The second measure $r_{t1}$
is a directed measure and is computed to see whether geographical and communication differences allow
the user that travels less to be predicted. Text S3 reports how these measures vary with home distance,
population density, normalized tie strength and call direction. In particular we find that as the distance
between users' homes increases, co-locations occur in a place that is closer to one of the users. Moreover, the more
the normalized tie strength differs between users, the more the co-locations occur in places close to one
of them.

Our definition of distance d is based on the Euclidean distance between home and co-location places. 
Two limitations arise from this choice: 1) the Euclidean distance does not take into account the real
path taken by a person; 2) the person might not travel directly from home, so the origin of the trip to
the co-location place could be different. However, as we are interested in the relative distances traveled
by the two peers, we can assume that both limitations affect the two measures in a similar manner, thus 
limiting the potential bias. 

We evaluate the relationship between the distance between home locations and the number of co-locations
between users. Figure 4(a) shows the average number of co-locations, which decreases with distance.
The result is consistent with what was found in [12, 25-27] using data from surveys. If we compare this
decrease with that of the number of phone calls and of total call time (see Figure 4(a)), we find different decays with
distance. Total call time is the least affected by distance (slope -0.04), followed by the number of calls
(slope -0.07). In contrast, the number of co-locations is strongly affected by distance (slope
-0.14). Even if we consider a broader definition of co-location, in which two users are considered co-located
if they happen to make a phone call (not necessarily to each other) from the same
cell tower area within one hour, we still find a similar decreasing trend, as shown in Figure 4(b), computed
for the D2 subset. The results are consistent with those from the analysis of fixed phone data combined
with interviews showing the effect of distance on call duration and frequency of meetings [28,29].
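
The broader co-location definition can be sketched as a windowed match on tower and time. This is a minimal illustration under assumptions: "within one hour" is read as an absolute time difference of at most 3600 seconds, each of user A's calls is counted at most once, and the function name and input layout are hypothetical.

```python
import bisect
from collections import defaultdict


def broad_colocations(calls_a, calls_b, window_s=3600):
    """Return user A's calls that have a nearby call by user B.

    calls_a and calls_b are lists of (timestamp_s, tower_id) for any
    calls made by each user, not necessarily to each other. A call by A
    matches if B made a call from the same tower within `window_s`.
    """
    # Index B's call times by tower, sorted for binary search.
    by_tower = defaultdict(list)
    for t, tower in calls_b:
        by_tower[tower].append(t)
    for times in by_tower.values():
        times.sort()

    matches = []
    for t, tower in calls_a:
        times = by_tower.get(tower)
        if not times:
            continue
        # First call by B at or after the start of the window.
        i = bisect.bisect_left(times, t - window_s)
        if i < len(times) and times[i] <= t + window_s:
            matches.append((t, tower))
    return matches
```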
The number of calls has a strong influence on the number of co-locations, suggesting that the more
people call each other, the more they co-locate (see Figure 5). As there appears to be a clear relationship
between call patterns, distance and co-locations, we tried to build a predictor of the number of co-locations,
starting from a measure of interaction (number of calls) and the geographical distance between users'
homes, obtaining $R^2 = 0.61$ with the model (Figure 6):

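$$\#\text{colocations} = 0.92 \cdot \frac{\#\text{calls}^{0.60}}{\text{distance}^{\beta}},$$

where $\beta > 0$ denotes the exponent of the home distance.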

This result suggests that geography and telecommunication interactions account for 61% of variations in 
the number of co-locations (see also Text S4). This is consistent during the one year time frame under 
analysis, as reported in Text S5. The exponent 0.60 for the #calls reveals the correlation between an
increase in the number of calls and an increase in the number of co-locations. This result suggests that 
telecommunications might play a complementary role in facilitating face-to-face interactions, supporting 
the observations found in other studies [4,5].
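
A power-law model of this form, with the number of co-locations proportional to a power of the number of calls divided by a power of the home distance, can be fitted by ordinary least squares after taking logarithms. The text does not state how the model was fitted, so the sketch below is one plausible approach and not the authors' procedure. The function names are hypothetical, all inputs are assumed strictly positive, and R^2 is computed in log space.

```python
import math


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_power_law(calls, dist, colocs):
    """Fit colocs ~ k * calls**alpha / dist**beta; return (k, alpha, beta, r2)."""
    # Design matrix for log(colocs) = log(k) + alpha*log(calls) - beta*log(dist).
    rows = [(1.0, math.log(c), -math.log(d)) for c, d in zip(calls, dist)]
    y = [math.log(v) for v in colocs]

    # Normal equations (X^T X) w = X^T y.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(3)]
    log_k, alpha, beta = solve_linear(xtx, xty)

    # Coefficient of determination on the log scale.
    pred = [log_k + alpha * r[1] + beta * r[2] for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - p) ** 2 for yi, p in zip(y, pred))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return math.exp(log_k), alpha, beta, 1.0 - ss_res / ss_tot
```

Fitting in log space weights relative rather than absolute errors, so an R^2 obtained this way is not directly comparable with one computed on the raw counts.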



Discussion 

In this study we analyze one year of telecommunications data from a large European cellphone operator 
to investigate the relationship between people's calls and their physical location. 

We discover that more than 90% of users who called each other have also shared the same space (cell 
tower), even if they live far apart. Moreover, we find that 69% of users who call each other frequently (at 
least once per month on average) have shared the same space at the same time - an instance that we call 
co-location. Co-locations appear highly indicative of coordination calls occurring just before face-to-face 
meetings. We are able to explain 61% of the variation in the number of co-locations from the number of
calls and the distance between users' homes. In particular, as the distance between homes increases, the expected
number of co-locations decreases. 

We also characterize the co-location places in terms of distance from the home locations. As the distance
between users' homes increases, co-locations occur in a place that is closer to one of the users. In more than
90% of the cases, co-locations take place in an area that is closer to the same user of the pair (there is 
low reciprocity in the travel distance covered). Telecommunication strength helps predict which person 
of the pair travels less. 

We believe that the above results suggest new ways to use CDRs to investigate the old conundrum 
of the interplay between telecommunications, travel and face-to-face meetings - with applications in the 
social sciences, urban planning and transportation studies. 
Figure 1. Probability that two reciprocally calling users have shared the same cell tower area during a
one-year period (D1 subset).

Figure 2. Average length of a call as a function of the users' home distance (D1 subset).

Figure 3. Number of calls between consecutive co-locations. Call times have been normalized to the
range of 0 to 1 (D2 subset).

Figure 4. Normalized number of co-locations, calls, and total call duration as a function of users' home
distance.

Figure 5. Average number of co-locations as a function of the number of calls (D1 subset).

Figure 6. A prediction of the number of co-locations. Error bars represent the standard deviations
(D1 subset).



